Info¶
In short:
- I built a set of embeddings for the items
- a new item is transformed into an embedding
- I look up the most similar item in the prepared set
- I assign the new item the class of that most similar item
In detail:
- I split the Excel file into test data (1000 rows) and train data (the rest, about 3k rows), preserving the proportions of the `main` class
- from the `train` data I built the embedding base:
  - each row becomes a text paragraph composed of `supplier_name`, `supplier_reference_description` and `purchase_price`
  - the model that generates the embeddings is the classic `sentence-transformers/all-mpnet-base-v2`
- for each row in the `test` data:
  - I create an analogous text paragraph
  - in the embedding base I pick the most similar embedding according to the `cosine` metric
  - I take its `main` class as the prediction
  - I narrow the base/training set to the rows with that `main` class
  - I search again for the most similar embedding and pick the `sub` class
  - I narrow the base/training set to the rows with that `sub` class and search for the `detail` class in the same way
  - I repeat the narrowing and searching to find the last class, `level4`
Classification accuracy metric:
- the fraction of correctly classified items from the `test` set
Limitations, errors:
- the base/training set must be kept up to date with respect to new items
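The steps above can be sketched on toy data. This is only an illustration under stated assumptions, not the notebook's code: the 2-D vectors stand in for real model embeddings, the three base items are made up, and the helper names `cosine` and `predict_hierarchy` are hypothetical.

```python
from math import sqrt

LEVELS = ["main", "sub", "detail", "level4"]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot_ab = sum(x * y for x, y in zip(a, b))
    return dot_ab / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def predict_hierarchy(query, base):
    """Pick the nearest base item, take its class at the current level,
    narrow the base to that class, and repeat for the next level."""
    candidates = base
    prediction = {}
    for level in LEVELS:
        best = max(candidates, key=lambda item: cosine(query, item["embedding"]))
        prediction[level] = best[level]
        candidates = [c for c in candidates if c[level] == best[level]]
    return prediction

# Hypothetical base set: toy 2-D vectors instead of real model embeddings.
base = [
    {"embedding": [1.0, 0.1], "main": "Decoration", "sub": "Clocks",
     "detail": "Wall Clocks", "level4": "Unspecified"},
    {"embedding": [0.9, 0.3], "main": "Decoration", "sub": "Clocks",
     "detail": "Table Clocks", "level4": "Unspecified"},
    {"embedding": [0.1, 1.0], "main": "Lighting", "sub": "Floor Lamps",
     "detail": "Unspecified", "level4": "Unspecified"},
]

result = predict_hierarchy([0.95, 0.28], base)
print(result)
```

In this sketch the globally nearest item always survives each narrowing step, so, ties aside, all four levels end up being read from that same neighbour.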
Imports¶
In [1]:
import pandas as pd
import plotly.graph_objects as go
import plotly.io as pio
from sentence_transformers import SentenceTransformer
from numpy import dot, argmax
from numpy.linalg import norm
from tqdm import tqdm
pio.templates.default = "plotly_dark"
Parameters¶
In [11]:
MAIN_CLASSES = [
    "Furniture",
    "Lighting",
    "Home Textiles",
    "Tableware",
    "Decoration",
    "Flowers & Plants"
]
TEST_ROWS = 1000
Utils¶
In [12]:
def train_test_split(raw_df: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame]:
    # keep only the known main classes; .copy() avoids SettingWithCopyWarning below
    df = raw_df[raw_df["main"].isin(MAIN_CLASSES)].copy()
    # fill na
    for col in ["main", "sub", "detail", "level4"]:
        df[col] = df[col].fillna("Unspecified")
    ratios = df["main"].value_counts(normalize=True).to_dict()
    df = df.sample(len(df))  # shuffle data
    # sample each main class in proportion to its share of the data
    test_df = pd.DataFrame()
    for main_class, ratio in ratios.items():
        new_df = df[df["main"] == main_class].sample(int(TEST_ROWS*ratio))
        test_df = pd.concat([test_df, new_df])
    # int() rounds down, so top up with random unused rows if the set is short
    if len(test_df) < TEST_ROWS:
        diff = TEST_ROWS - len(test_df)
        test_df = pd.concat([
            test_df,
            df[~(df["item_id"].isin(test_df["item_id"]))].sample(diff)
        ])
    # every row not used for testing becomes the base/training set
    train_df = df[~(df["item_id"].isin(test_df["item_id"]))].copy()
    return test_df, train_df
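The per-class sample size `int(TEST_ROWS*ratio)` rounds down, which is why the function tops the test set up with random rows whenever it ends short of `TEST_ROWS`. A small sketch of that arithmetic, using hypothetical class shares that are not taken from the dataset:

```python
TEST_ROWS = 1000

# Hypothetical class shares that sum to 1.0 (illustrative, not the real data).
ratios = {"Decoration": 0.7345, "Furniture": 0.1432, "Lighting": 0.1223}

# Same rounding rule as in train_test_split: truncate toward zero.
per_class = {name: int(TEST_ROWS * ratio) for name, ratio in ratios.items()}
allocated = sum(per_class.values())
shortfall = TEST_ROWS - allocated  # rows that still have to be topped up

print(per_class)  # {'Decoration': 734, 'Furniture': 143, 'Lighting': 122}
print(shortfall)  # 1
```

Because the top-up rows are drawn from all remaining rows rather than per class, the final proportions can differ very slightly from the exact class shares.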
In [5]:
def get_embedder(model_id: str) -> SentenceTransformer:
    match model_id:
        case "sentence-transformers/all-mpnet-base-v2":
            return SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
        case _:
            raise ValueError(f"Unsupported model_id: {model_id}")
In [6]:
def generate_embedding_from_text(
    model: SentenceTransformer,
    text_data: list[str]
) -> list[list[float]]:
    results = []
    # encode one text at a time so tqdm can report per-item progress
    for x in tqdm(text_data):
        embedding = model.encode([x])[0]
        results.append(embedding)
    return results
In [7]:
def row_to_text_input(df: pd.DataFrame, i: int) -> str:
    text = f"""
    Supplier name = {df["supplier_name"].iloc[i]}
    Product name = {df["supplier_reference_description"].iloc[i]}
    Product price = {df["purchase_price"].iloc[i]}
    """
    return text
In [8]:
def cosine_sim(a, b) -> float:
    return float(dot(a, b)/(norm(a)*norm(b)))
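As a quick sanity check of the formula: cosine similarity depends only on the direction of the vectors, not on their length. The sketch below re-implements the same expression with the standard library only; the name `cosine_sim_plain` and the sample vectors are made up for this example.

```python
from math import sqrt

def cosine_sim_plain(a, b):
    """dot(a, b) / (|a| * |b|), the same formula as cosine_sim above."""
    dot_ab = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot_ab / (norm_a * norm_b)

# Same direction, different length: similarity close to 1.
same_direction = cosine_sim_plain([1.0, 2.0, 2.0], [2.0, 4.0, 4.0])
# Perpendicular vectors: similarity 0.
orthogonal = cosine_sim_plain([1.0, 0.0], [0.0, 3.0])
# Opposite direction: similarity close to -1.
opposite = cosine_sim_plain([1.0, 1.0], [-2.0, -2.0])

print(same_direction, orthogonal, opposite)
```

Because only direction matters, picking the item with the highest score is not affected by how long either embedding vector is.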
In [9]:
def generate_ratio_df(errors_df: pd.DataFrame, test_df: pd.DataFrame, col: str):
    error_ratios = errors_df[col].value_counts(normalize=True).reset_index().rename(columns={"proportion": "ratio_in_errors"})
    test_ratios = test_df[col].value_counts(normalize=True).reset_index().rename(columns={"proportion": "ratio_in_tests"})
    ratios_df = pd.merge(
        left=error_ratios,
        right=test_ratios,
        on=col,
        how="right"
    ).round(2).fillna(0)
    ratios_df["diff"] = ratios_df["ratio_in_errors"] - ratios_df["ratio_in_tests"]
    print(f'r Pearson Correlation = {round(ratios_df[["ratio_in_errors", "ratio_in_tests"]].corr()["ratio_in_tests"].iloc[0], 3)}')
    return ratios_df
Predictions¶
In [ ]:
raw_df = pd.read_csv("../resources/item data 2026_AW(Sheet1).csv", sep=",")
In [10]:
embedder = get_embedder("sentence-transformers/all-mpnet-base-v2")
In [13]:
test_df, train_df = train_test_split(raw_df)
Training / base embeddings¶
In [14]:
text_inputs = [
    row_to_text_input(train_df, i)
    for i in range(len(train_df))
]
base_embeddings = generate_embedding_from_text(
    model=embedder,
    text_data=text_inputs
)
train_df["embedding"] = base_embeddings
100%|██████████| 3054/3054 [02:57<00:00, 17.22it/s]
Embeddings of "new" items¶
In [15]:
text_inputs = [
    row_to_text_input(test_df, i)
    for i in range(len(test_df))
]
test_embeddings = generate_embedding_from_text(
    model=embedder,
    text_data=text_inputs
)
100%|██████████| 1000/1000 [01:27<00:00, 11.44it/s]
Find the most similar items¶
In [16]:
pred_main, pred_sub, pred_detail, pred_level4 = [], [], [], []
for test_idx in tqdm(range(len(test_df))):
    embedding = test_embeddings[test_idx]
    # main prediction: nearest neighbour in the whole base set
    sim_scores = [cosine_sim(embedding, x) for x in base_embeddings]
    best_idx = argmax(sim_scores)
    main = train_df["main"].iloc[best_idx]
    # sub prediction: nearest neighbour among rows with the predicted main class
    train_df_selected = train_df[train_df["main"] == main]
    base_embeddings_selected = train_df_selected["embedding"].to_list()
    sim_scores = [cosine_sim(embedding, x) for x in base_embeddings_selected]
    best_idx = argmax(sim_scores)
    sub = train_df_selected["sub"].iloc[best_idx]
    # detail prediction: narrow further to the predicted sub class
    train_df_selected = train_df_selected[train_df_selected["sub"] == sub]
    base_embeddings_selected = train_df_selected["embedding"].to_list()
    sim_scores = [cosine_sim(embedding, x) for x in base_embeddings_selected]
    best_idx = argmax(sim_scores)
    detail = train_df_selected["detail"].iloc[best_idx]
    # level4 prediction: narrow further to the predicted detail class
    train_df_selected = train_df_selected[train_df_selected["detail"] == detail]
    base_embeddings_selected = train_df_selected["embedding"].to_list()
    sim_scores = [cosine_sim(embedding, x) for x in base_embeddings_selected]
    best_idx = argmax(sim_scores)
    level4 = train_df_selected["level4"].iloc[best_idx]
    pred_main.append(main)
    pred_sub.append(sub)
    pred_detail.append(detail)
    pred_level4.append(level4)
test_df["pred_main"] = pred_main
test_df["pred_sub"] = pred_sub
test_df["pred_detail"] = pred_detail
test_df["pred_level4"] = pred_level4
100%|██████████| 1000/1000 [01:07<00:00, 14.81it/s]
Estimate quality¶
In [17]:
test_n = len(test_df)
main_success_ratio = len(test_df[test_df["main"] == test_df["pred_main"]]) / test_n
sub_success_ratio = len(test_df[test_df["sub"] == test_df["pred_sub"]]) / test_n
detail_success_ratio = len(test_df[test_df["detail"] == test_df["pred_detail"]]) / test_n
level4_success_ratio = len(test_df[test_df["level4"] == test_df["pred_level4"]]) / test_n
total_success_ratio = len(
    test_df[(
        (test_df["main"] == test_df["pred_main"])
        & (test_df["sub"] == test_df["pred_sub"])
        & (test_df["detail"] == test_df["pred_detail"])
        & (test_df["level4"] == test_df["pred_level4"])
    )]
) / test_n
print("main_success_ratio = ", round(main_success_ratio, 3))
print("sub_success_ratio = ", round(sub_success_ratio, 3))
print("detail_success_ratio = ", round(detail_success_ratio, 3))
print("level4_success_ratio = ", round(level4_success_ratio, 3))
print("total_success_ratio = ", round(total_success_ratio, 3))
main_success_ratio = 0.981
sub_success_ratio = 0.947
detail_success_ratio = 0.919
level4_success_ratio = 0.964
total_success_ratio = 0.907
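A row counts towards `total_success_ratio` only if all four levels match, so the total can never exceed the lowest per-level ratio. With 1000 test rows, 0.907 corresponds to 907 fully correct items and 93 items with at least one wrong level. A toy sketch of the same computation, using made-up class paths rather than real data:

```python
LEVELS = ["main", "sub", "detail", "level4"]

# Made-up (true, predicted) class paths for four items; not real data.
rows = [
    (("Lighting", "Floor Lamps", "Unspecified", "Unspecified"),
     ("Lighting", "Floor Lamps", "Unspecified", "Unspecified")),
    (("Decoration", "Clocks", "Table Clocks", "Unspecified"),
     ("Decoration", "Clocks", "Wall Clocks", "Unspecified")),
    (("Furniture", "Tables", "Coffee Tables", "Unspecified"),
     ("Furniture", "Tables", "Side Tables", "Unspecified")),
    (("Tableware", "Dinnerware", "Mugs", "Unspecified"),
     ("Tableware", "Serveware", "Teapots & Accessories", "Unspecified")),
]

n = len(rows)
# Share of rows whose label matches at a single level.
per_level = {
    level: sum(true[i] == pred[i] for true, pred in rows) / n
    for i, level in enumerate(LEVELS)
}
# Share of rows whose whole class path matches.
total = sum(true == pred for true, pred in rows) / n

print(per_level)  # {'main': 1.0, 'sub': 0.75, 'detail': 0.25, 'level4': 1.0}
print(total)      # 0.25
```

The toy data also shows how a deeper level can score higher than a shallower one, as `level4` does relative to `detail` in the results: when most items are `Unspecified` at that level, a prediction can match there even though an earlier level was wrong.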
Visualization¶
In [25]:
fig = go.Figure()
fig.add_trace(
    go.Bar(
        # orientation="h",
        x=[
            "main",
            "sub",
            "detail",
            "level4",
            "total"
        ],
        y=[
            main_success_ratio,
            sub_success_ratio,
            detail_success_ratio,
            level4_success_ratio,
            total_success_ratio
        ],
        text=[
            main_success_ratio,
            sub_success_ratio,
            detail_success_ratio,
            level4_success_ratio,
            total_success_ratio
        ],
        marker_color=[
            "silver", "silver", "silver", "silver", "teal"
        ]
    )
)
fig.update_layout(
    title="Successful predictions",
    width=1000,
    height=600
)
fig.show(renderer="notebook")
Error analysis¶
In [19]:
errors_df = test_df[~(
    (test_df["main"] == test_df["pred_main"])
    & (test_df["sub"] == test_df["pred_sub"])
    & (test_df["detail"] == test_df["pred_detail"])
    & (test_df["level4"] == test_df["pred_level4"])
)]
Errors¶
In [20]:
for i in range(len(errors_df)):
    real_class = f'{errors_df["main"].iloc[i]} / {errors_df["sub"].iloc[i]} / {errors_df["detail"].iloc[i]} / {errors_df["level4"].iloc[i]}'
    pred_class = f'{errors_df["pred_main"].iloc[i]} / {errors_df["pred_sub"].iloc[i]} / {errors_df["pred_detail"].iloc[i]} / {errors_df["pred_level4"].iloc[i]}'
    print(f"Real = {real_class}\nPred = {pred_class}\n\n")
Real = Decoration / Home Accessories / Decorative Objects / Objects
Pred = Decoration / Home Accessories / Decorative Objects / Decorative Letters & Numbers

Real = Decoration / Home Accessories / Decorative Objects / Objects
Pred = Decoration / Decoration Storage / Coat Racks / Unspecified

Real = Decoration / Mirrors / Wall Mirrors / Unspecified
Pred = Decoration / Decorative Materials / Unspecified / Unspecified

Real = Decoration / Decoration Storage / Unspecified / Unspecified
Pred = Decoration / Decoration Storage / Storage Boxes / Unspecified

Real = Decoration / Decoration Storage / Unspecified / Unspecified
Pred = Decoration / Decoration Storage / Storage Boxes / Unspecified

Real = Decoration / Home Accessories / Figurines / Flowers & Plants
Pred = Decoration / Flower Pots & Vases / Flower pots / Unspecified

Real = Decoration / Home Accessories / Figurines / Fantasy
Pred = Home Textiles / Cushions / Cushions & Cushion Covers / Unspecified

Real = Decoration / Garden Accessories / Bird Houses & Cages / Unspecified
Pred = Lighting / Wall Lamps / Unspecified / Unspecified

Real = Decoration / Home Accessories / Other / Garlands & hangers
Pred = Decoration / Home Accessories / Decorative Objects / Objects

Real = Decoration / Home Accessories / Decorative Objects / Objects
Pred = Decoration / Candles & Candle Holders / Tealight Holders / Unspecified

Real = Decoration / Home Accessories / Decorative Objects / Objects
Pred = Decoration / Home Accessories / Figurines / People

Real = Decoration / Clocks / Table Clocks / Unspecified
Pred = Decoration / Clocks / Wall Clocks / Unspecified

Real = Decoration / Unspecified / Unspecified / Unspecified
Pred = Decoration / Home Accessories / Figurines / Animals

Real = Decoration / Home Accessories / Figurines / Animals
Pred = Decoration / Flower Pots & Vases / Flower pots / Unspecified

Real = Decoration / Clocks / Table Clocks / Unspecified
Pred = Decoration / Clocks / Wall Clocks / Unspecified

Real = Decoration / Flower Pots & Vases / Floor Vases / Unspecified
Pred = Decoration / Flower Pots & Vases / Vases / Unspecified

Real = Decoration / Decoration Storage / Unspecified / Unspecified
Pred = Decoration / Decoration Storage / Storage Boxes / Unspecified

Real = Decoration / Flower Pots & Vases / Vases / Unspecified
Pred = Decoration / Flower Pots & Vases / Soliflores / Unspecified

Real = Decoration / Home Accessories / Decorative Objects / Decorative Trees
Pred = Lighting / Decorative Lighting / Decorative Lighting / Unspecified

Real = Decoration / Decoration Storage / Storage Jars / Unspecified
Pred = Decoration / Flower Pots & Vases / Vases / Unspecified

Real = Decoration / Home Accessories / Decorative Objects / Objects
Pred = Decoration / Home Accessories / Figurines / Animals

Real = Decoration / Flower Pots & Vases / Vases / Unspecified
Pred = Decoration / Flower Pots & Vases / Flower pots / Unspecified

Real = Decoration / Wall Decoration / Prints & Posters / Unspecified
Pred = Decoration / Candles & Candle Holders / Tealight Holders / Unspecified

Real = Decoration / Flower Pots & Vases / Vases / Unspecified
Pred = Decoration / Flower Pots & Vases / Floor Vases / Unspecified

Real = Decoration / Home Accessories / Other / Garlands & hangers
Pred = Lighting / Decorative Lighting / Decorative Lighting / Unspecified

Real = Decoration / Home Accessories / Figurines / Animals
Pred = Decoration / Home Accessories / Figurines / Flowers & Plants

Real = Decoration / Candles & Candle Holders / Tealight Holders / Unspecified
Pred = Decoration / Candles & Candle Holders / Hurricane Lights & Lanterns / Unspecified

Real = Decoration / Home Accessories / Figurines / Animals
Pred = Home Textiles / Soft Toys / Unspecified / Unspecified

Real = Decoration / Garden Accessories / Fire Pits, Braziers & Fireplaces / Unspecified
Pred = Lighting / Desk & Table Lamps / Unspecified / Unspecified

Real = Decoration / Home Accessories / Figurines / People
Pred = Home Textiles / Cushions / Cushions & Cushion Covers / Unspecified

Real = Decoration / Wall Decoration / Paintings / Unspecified
Pred = Decoration / Wall Decoration / Unspecified / Unspecified

Real = Decoration / Home Accessories / Decorative Objects / Decorative Trays
Pred = Decoration / Home Accessories / Decorative Objects / Decorative Trees

Real = Decoration / Home Accessories / Figurines / Animals
Pred = Decoration / Home Accessories / Figurines / Flowers & Plants

Real = Decoration / Home Accessories / Decorative Objects / Objects
Pred = Furniture / Tables / Side Tables / Unspecified

Real = Decoration / Wall Decoration / Paintings / Unspecified
Pred = Decoration / Wall Decoration / Prints & Posters / Unspecified

Real = Decoration / Unspecified / Unspecified / Unspecified
Pred = Decoration / Home Accessories / Other / Garlands & hangers

Real = Decoration / Flower Pots & Vases / Vases / Unspecified
Pred = Decoration / Flower Pots & Vases / Floor Vases / Unspecified

Real = Decoration / Home Accessories / Other / Garlands & hangers
Pred = Decoration / Home Accessories / Unspecified / Unspecified

Real = Decoration / Home Accessories / Decorative Objects / Decorative Cones
Pred = Decoration / Home Accessories / Decorative Objects / Decorative Trees

Real = Decoration / Home Accessories / Decorative Objects / Objects
Pred = Decoration / Decorative Materials / Unspecified / Unspecified

Real = Decoration / Home Accessories / Decorative Objects / Objects
Pred = Decoration / Home Accessories / Figurines / Fantasy

Real = Decoration / Home Accessories / Decorative Objects / Objects
Pred = Decoration / Unspecified / Unspecified / Unspecified

Real = Decoration / Flower Pots & Vases / Vases / Unspecified
Pred = Decoration / Flower Pots & Vases / Floor Vases / Unspecified

Real = Decoration / Home Accessories / Figurines / People
Pred = Decoration / Home Accessories / Decorative Objects / Objects

Real = Decoration / Home Accessories / Figurines / Animals
Pred = Decoration / Unspecified / Unspecified / Unspecified

Real = Decoration / Home Accessories / Other / Garlands & hangers
Pred = Decoration / Home Accessories / Figurines / Fantasy

Real = Decoration / Candles & Candle Holders / Tealight Holders / Unspecified
Pred = Decoration / Flower Pots & Vases / Vases / Unspecified

Real = Decoration / Home Accessories / Decorative Objects / Objects
Pred = Decoration / Home Accessories / Figurines / Fantasy

Real = Decoration / Flower Pots & Vases / Flower pots / Unspecified
Pred = Decoration / Flower Pots & Vases / Vases / Unspecified

Real = Decoration / Wall Decoration / Paintings / Unspecified
Pred = Decoration / Photo Frames / Photo Frames / Unspecified

Real = Decoration / Candles & Candle Holders / Tealight Holders / Unspecified
Pred = Decoration / Candles & Candle Holders / Hurricane Lights & Lanterns / Unspecified

Real = Decoration / Home Accessories / Decorative Objects / Objects
Pred = Decoration / Home Accessories / Figurines / Animals

Real = Decoration / Home Accessories / Other / Garlands & hangers
Pred = Decoration / Decorative Materials / Unspecified / Unspecified

Real = Decoration / Flower Pots & Vases / Flower pots / Unspecified
Pred = Decoration / Flower Pots & Vases / Vases / Unspecified

Real = Furniture / Tables / Pedestals / Unspecified
Pred = Decoration / Flower Pots & Vases / Flower pots / Unspecified

Real = Furniture / Tables / Coffee Tables / Unspecified
Pred = Furniture / Tables / Side Tables / Unspecified

Real = Furniture / Storage / Shelving Units / Unspecified
Pred = Decoration / Home Accessories / Decorative Objects / Decorative Trees

Real = Furniture / Storage / Shelving Units / Unspecified
Pred = Furniture / Tables / Coffee Tables / Unspecified

Real = Furniture / Storage / Buffets / Unspecified
Pred = Furniture / Storage / Cabinets & Sideboards / Unspecified

Real = Furniture / Sofas & Armchairs / Armchairs / Unspecified
Pred = Furniture / Chairs / Office Chairs / Unspecified

Real = Furniture / Tables / Side Tables / Unspecified
Pred = Furniture / Tables / Coffee Tables / Unspecified

Real = Furniture / Storage / Bookcases / Unspecified
Pred = Lighting / Desk & Table Lamps / Unspecified / Unspecified

Real = Furniture / Tables / Coffee Tables / Unspecified
Pred = Decoration / Decoration Storage / Storage Boxes / Unspecified

Real = Furniture / Tables / Coffee Tables / Unspecified
Pred = Decoration / Clocks / Wall Clocks / Unspecified

Real = Furniture / Tables / Coffee Tables / Unspecified
Pred = Furniture / Tables / Side Tables / Unspecified

Real = Furniture / Storage / Chests of Drawers / Unspecified
Pred = Lighting / Desk & Table Lamps / Unspecified / Unspecified

Real = Furniture / Tables / Side Tables / Unspecified
Pred = Furniture / Storage / Trolleys / Unspecified

Real = Furniture / Storage / Ladder Shelves / Unspecified
Pred = Furniture / Storage / Wall Shelves / Unspecified

Real = Furniture / Tables / Coffee Tables / Unspecified
Pred = Furniture / Tables / Side Tables / Unspecified

Real = Tableware / Serveware / Teapots & Accessories / Unspecified
Pred = Tableware / Unspecified / Unspecified / Unspecified

Real = Tableware / Serveware / Teapots & Accessories / Unspecified
Pred = Tableware / Dinnerware / Mugs / Unspecified

Real = Tableware / Dinnerware / Mugs / Unspecified
Pred = Tableware / Dinnerware / Bowls / Unspecified

Real = Tableware / Glassware / Wine Glasses / Unspecified
Pred = Tableware / Glassware / Champagne Glasses / Unspecified

Real = Tableware / Serveware / Teapots & Accessories / Unspecified
Pred = Tableware / Unspecified / Unspecified / Unspecified

Real = Tableware / Dinnerware / Mugs / Unspecified
Pred = Tableware / Serveware / Teapots & Accessories / Unspecified

Real = Tableware / Glassware / Wine Glasses / Unspecified
Pred = Tableware / Glassware / Champagne Glasses / Unspecified

Real = Tableware / Wine & Bar Accessories / Decanters & Bottles / Unspecified
Pred = Tableware / Glassware / Drinking Glasses / Unspecified

Real = Tableware / Wine & Bar Accessories / Decanters & Bottles / Unspecified
Pred = Tableware / Glassware / Drinking Glasses / Unspecified

Real = Lighting / Floor Lamps / Unspecified / Unspecified
Pred = Lighting / Desk & Table Lamps / Unspecified / Unspecified

Real = Lighting / Floor Lamps / Unspecified / Unspecified
Pred = Lighting / Desk & Table Lamps / Unspecified / Unspecified

Real = Lighting / Lighting Accessories / Lamp Shades / Unspecified
Pred = Lighting / Floor Lamps / Unspecified / Unspecified

Real = Lighting / Desk & Table Lamps / Unspecified / Unspecified
Pred = Lighting / Decorative Lighting / Decorative Lighting / Unspecified

Real = Lighting / Lighting Accessories / Light bulbs / Unspecified
Pred = Lighting / Desk & Table Lamps / Unspecified / Unspecified

Real = Lighting / Floor Lamps / Unspecified / Unspecified
Pred = Lighting / Desk & Table Lamps / Unspecified / Unspecified

Real = Home Textiles / Unspecified / Unspecified / Unspecified
Pred = Decoration / Home Accessories / Other / Doorstoppers

Real = Home Textiles / Unspecified / Unspecified / Unspecified
Pred = Decoration / Home Accessories / Figurines / People

Real = Home Textiles / Unspecified / Unspecified / Unspecified
Pred = Decoration / Home Accessories / Other / Doorstoppers

Real = Home Textiles / Unspecified / Unspecified / Unspecified
Pred = Decoration / Home Accessories / Other / Doorstoppers

Real = Home Textiles / Unspecified / Unspecified / Unspecified
Pred = Decoration / Home Accessories / Other / Doorstoppers

Real = Flowers & Plants / Artificial Flowers / Unspecified / Unspecified
Pred = Flowers & Plants / Unspecified / Unspecified / Unspecified

Real = Flowers & Plants / Artificial Branches / Unspecified / Unspecified
Pred = Flowers & Plants / Artificial Flowers / Unspecified / Unspecified

Real = Flowers & Plants / Artificial Branches / Unspecified / Unspecified
Pred = Flowers & Plants / Artificial Flowers / Unspecified / Unspecified

Real = Flowers & Plants / Artificial Branches / Unspecified / Unspecified
Pred = Flowers & Plants / Artificial Flowers / Unspecified / Unspecified
Class representativeness - correlation of class proportions in the error data with those in the test data¶
- the higher the correlation, the more similar the class proportions
In [21]:
generate_ratio_df(errors_df, test_df, "main")
r Pearson Correlation = 0.988
Out[21]:
| | main | ratio_in_errors | ratio_in_tests | diff |
|---|---|---|---|---|
| 0 | Decoration | 0.58 | 0.73 | -0.15 |
| 1 | Furniture | 0.16 | 0.08 | 0.08 |
| 2 | Tableware | 0.10 | 0.07 | 0.03 |
| 3 | Lighting | 0.06 | 0.05 | 0.01 |
| 4 | Home Textiles | 0.05 | 0.05 | 0.00 |
| 5 | Flowers & Plants | 0.04 | 0.02 | 0.02 |
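The printed correlation can be re-derived by hand from the two ratio columns of the `main` table above. The sketch below uses only the standard library; `pearson_r` is a helper name made up for this example, while the notebook itself relies on `DataFrame.corr()`.

```python
from math import sqrt

# ratio_in_errors and ratio_in_tests columns of the `main` table above.
ratio_in_errors = [0.58, 0.16, 0.10, 0.06, 0.05, 0.04]
ratio_in_tests = [0.73, 0.08, 0.07, 0.05, 0.05, 0.02]

def pearson_r(xs, ys):
    """Pearson correlation: co-deviation divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

r = pearson_r(ratio_in_errors, ratio_in_tests)
print(round(r, 3))  # 0.988, the value printed by generate_ratio_df
```

With only six classes, the coefficient is dominated by the single large `Decoration` share, so a high value here says little about the smaller classes; the `diff` column is the more direct view of where errors are over- or under-represented.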
In [22]:
generate_ratio_df(errors_df, test_df, "sub")
r Pearson Correlation = 0.859
Out[22]:
| | sub | ratio_in_errors | ratio_in_tests | diff |
|---|---|---|---|---|
| 0 | Home Accessories | 0.30 | 0.32 | -0.02 |
| 1 | Candles & Candle Holders | 0.03 | 0.16 | -0.13 |
| 2 | Flower Pots & Vases | 0.09 | 0.14 | -0.05 |
| 3 | Wall Decoration | 0.04 | 0.04 | 0.00 |
| 4 | Decoration Storage | 0.04 | 0.03 | 0.01 |
| 5 | Unspecified | 0.08 | 0.03 | 0.05 |
| 6 | Sofas & Armchairs | 0.01 | 0.03 | -0.02 |
| 7 | Tables | 0.09 | 0.03 | 0.06 |
| 8 | Dinnerware | 0.02 | 0.02 | 0.00 |
| 9 | Desk & Table Lamps | 0.01 | 0.02 | -0.01 |
| 10 | Cushions | 0.00 | 0.02 | -0.02 |
| 11 | Home Fragrances | 0.00 | 0.02 | -0.02 |
| 12 | Decorative Lighting | 0.00 | 0.02 | -0.02 |
| 13 | Serveware | 0.03 | 0.01 | 0.02 |
| 14 | Storage | 0.06 | 0.01 | 0.05 |
| 15 | Wine & Bar Accessories | 0.02 | 0.01 | 0.01 |
| 16 | Artificial Flowers | 0.01 | 0.01 | 0.00 |
| 17 | Table & Kitchen Accessories | 0.00 | 0.01 | -0.01 |
| 18 | Chairs | 0.00 | 0.01 | -0.01 |
| 19 | Lighting Accessories | 0.02 | 0.01 | 0.01 |
| 20 | Glassware | 0.02 | 0.01 | 0.01 |
| 21 | Mirrors | 0.01 | 0.01 | 0.00 |
| 22 | Photo Frames | 0.00 | 0.00 | 0.00 |
| 23 | Soft Toys | 0.00 | 0.00 | 0.00 |
| 24 | Floor Lamps | 0.03 | 0.00 | 0.03 |
| 25 | Clocks | 0.02 | 0.00 | 0.02 |
| 26 | Artificial Branches | 0.03 | 0.00 | 0.03 |
| 27 | Artificial Trees | 0.00 | 0.00 | 0.00 |
| 28 | Ceiling Lamps | 0.00 | 0.00 | 0.00 |
| 29 | Blankets & Throws | 0.00 | 0.00 | 0.00 |
| 30 | Garden Accessories | 0.02 | 0.00 | 0.02 |
| 31 | Rugs | 0.00 | 0.00 | 0.00 |
| 32 | Bed Linen | 0.00 | 0.00 | 0.00 |
| 33 | Decorative Materials | 0.00 | 0.00 | 0.00 |
| 34 | Cutlery | 0.00 | 0.00 | 0.00 |
In [23]:
generate_ratio_df(errors_df, test_df, "detail")
r Pearson Correlation = 0.756
Out[23]:
| | detail | ratio_in_errors | ratio_in_tests | diff |
|---|---|---|---|---|
| 0 | Figurines | 0.10 | 0.14 | -0.04 |
| 1 | Tealight Holders | 0.03 | 0.13 | -0.10 |
| 2 | Decorative Objects | 0.15 | 0.12 | 0.03 |
| 3 | Vases | 0.05 | 0.10 | -0.05 |
| 4 | Unspecified | 0.19 | 0.09 | 0.10 |
| ... | ... | ... | ... | ... |
| 64 | Buffets | 0.01 | 0.00 | 0.01 |
| 65 | Chests of Drawers | 0.01 | 0.00 | 0.01 |
| 66 | Salad Servers | 0.00 | 0.00 | 0.00 |
| 67 | Ice Buckets | 0.00 | 0.00 | 0.00 |
| 68 | Light bulbs | 0.01 | 0.00 | 0.01 |
69 rows × 4 columns
In [24]:
generate_ratio_df(errors_df, test_df, "level4")
r Pearson Correlation = 0.988
Out[24]:
| | level4 | ratio_in_errors | ratio_in_tests | diff |
|---|---|---|---|---|
| 0 | Unspecified | 0.70 | 0.69 | 0.01 |
| 1 | Animals | 0.05 | 0.08 | -0.03 |
| 2 | Decorative Trees | 0.01 | 0.05 | -0.04 |
| 3 | Garlands & hangers | 0.05 | 0.04 | 0.01 |
| 4 | Objects | 0.12 | 0.03 | 0.09 |
| 5 | Fantasy | 0.01 | 0.02 | -0.01 |
| 6 | People | 0.02 | 0.02 | 0.00 |
| 7 | Decorative Trays | 0.01 | 0.02 | -0.01 |
| 8 | Decorative Cones | 0.01 | 0.01 | 0.00 |
| 9 | Flowers & Plants | 0.01 | 0.01 | 0.00 |
| 10 | Wreath | 0.00 | 0.01 | -0.01 |
| 11 | Doorstoppers | 0.00 | 0.00 | 0.00 |
| 12 | Buddhas | 0.00 | 0.00 | 0.00 |
| 13 | Paperweights | 0.00 | 0.00 | 0.00 |
| 14 | Other | 0.00 | 0.00 | 0.00 |
| 15 | Decorative Bottles | 0.00 | 0.00 | 0.00 |
| 16 | Decorative Letters & Numbers | 0.00 | 0.00 | 0.00 |
| 17 | Bookends | 0.00 | 0.00 | 0.00 |
| 18 | Abstract | 0.00 | 0.00 | 0.00 |